Executive Summary


Methodology for Evaluation of Performance and Products of Scanning Service 
Providers 


IPI was contracted to consult on methods to evaluate the performance and products of 
scanning service providers hired to convert Library materials from paper, flat film, and 
microfilm to digital images. The Library is first of all looking for guidance for the measure- 
ment of spatial resolution. The report shall recommend quality assurance procedures to be 
used to create measurable images, together with a description of the tools and/or devices 
needed to measure the outcome of the resulting images. Specifically, the NDLP is seeking 
technical guidance regarding such concerns as how to perform tests after scanning to measure
capture resolution (and tonality) by interpreting or reading test targets or by other
means which may be suggested.


Overview 


The specific questions in the contract about spatial resolution measures cannot really be 
addressed independently—they must be part of the larger picture encompassing other image 
quality parameters and work-flow issues. 

The best approach to digital image quality control includes, on one hand, subjective 
visual inspection using monitors and printers and, on the other hand, objective measure- 
ments performed in software on the digital files themselves. 

Efforts should be made to standardize the procedures and equipment for subjective 
evaluations by means of monitor and printer calibration. For objective image quality mea- 
surement, software should be available which is designed to locate and evaluate specific 
targets and then to report numbers or graphs describing key image quality parameters. Such 
software should ideally be a plug-in to a full-featured image browser so that review of all 
aspects of the image file (header info, index and tracking data, etc.) can be done at once. 

A key point is that targets and the software to evaluate them are not just for vendor 
checking—they serve to guarantee the long-term usefulness of the digital files and to protect 
the investment the Library has made in creating them. Known targets that “ride” with every 
group of images allow for graceful migrations from one file format or computer platform to 
another. This additional information could be part of the image header or a separate file
linked to specific batches of image files. Internal quality reference points raise the likelihood
that images can be dealt with in large batches in a controllable way, thereby leveraging the
investment in all the scanning done in the first place. With digital files, targets have a continuing
function that they did not have in conventional microfilming: they are like trail
markers through all the manipulations and migrations.

No adequate off-the-shelf target and software solutions are commercially available. The
Library will have to further examine its requirements and perhaps become involved as a
partner with commercial vendors to create a product to meet its specific needs.


Suggested Approach to Vendor Qualification 


To achieve the goal of building an archive of long-term value, a whole set of image 
quality issues should be looked at. IPI suggests including tests on tone reproduction, modu- 
lation transfer function, and noise in the quality test. For some materials, tone reproduction 
is not valid. 

The proposed method includes two stages: first, the qualification of the vendors and, 
second, a quality measurement tool that can be used by customers to assure scanning quality 
over time. It should be kept in mind, however, that it might be difficult to carry out this 
second stage, because the vendor might be opposed to it. Nevertheless, the experience of 
similar projects has shown that once vendors accept the testing they make it part of their 
work flow because it helps them find and get rid of flaws in their equipment. 

Tests should be conducted by scanning a given calibrated test target and evaluating the 
digital file with the help of a specially developed software program. The software should 
ideally be a plug-in to a full-featured image browser like Photoshop. 

The rigorous image quality test suite for initial vendor qualification that should be 
included in the request for proposals involves tests for all three quality parameters (and 
color reproduction as well, in the case of color images); it includes test targets and sample 
images. Once the selection is made and the scanning has started, vendors should be checked 
on a regular basis. These routine quality-assessment tests can be less rigid and include only 
the key parameters (tone reproduction, resolution, color reproduction). Full versions of the 
targets could be scanned every few hundred images and then linked to specific batches of 
production files, or smaller versions of the targets could be included with every image. The 
noise of the hardware used should actually not change, unless the scanner operator changes
the way he or she works or dirt builds up in the system.


Rigorous Vendor Qualification 


One of the first steps should be to rule out so-called image defects such as dirt, “half
images,” skew, and so on. Whether some preliminary image manipulations (such as resampling)
were done should be considered.

After this initial step, three different aspects of the scanned images should be looked at:

• Tone reproduction
• Detail and edge reproduction (MTF)
• Noise

For future projects a test for color reproduction should be included.

Each one of the above classes needs special targets for the different forms of images
(e.g., prints, transparencies, etc.).

To complete the image quality framework, a number of other tests and procedures
should be included in the future. Some scanner manufacturers already have software in place
to test their equipment, but it is usually very specific.

For every image, corrections should be made to take care of the different “amplifying
factors” of the CCDs and the nonhomogeneous illumination in the scanner. Testing for flare
(stray light in the optical system) might be necessary when scanning transparencies with a
large dynamic range. In the case of color scans, a test for the registration (precise overlapping)
of the three color channels is necessary.


Tone Reproduction 


This test will show how linearly the system works 
regarding the density values (a logarithmic unit) of the 
original. Achieving linearity for scanner output means that 
the relationship of tonal values of the image is not distorted. 

Linearity and highest achievable density depend on the 
optics of the systems (e.g., flare) but also on the A/D (ana- 
log to digital) converter of the system. Therefore, the highest 
achievable density of a system is a combination of optics and 
the electronics used in the A/D converter. 

Fig. 1 Calibrated gray-scale test targets serve as a link back to the reality of the original
document or photograph.

Fig. 2 Using a calibrated gray scale and the resulting digital values, the linearity of
the scanning device can be determined. The reflection or transmission density of each
step of the gray scale can be measured with a densitometer. Plotting these values against
the digital values of the steps in the image file shows the performance of the scanning
device over the whole range of densities. Specifications will have to be worked out
regarding how big a deviation from the linear curve is acceptable.
[Graph: digital value (0 to 255) plotted against density (0 to 4).]

Reproducing the gray scale correctly usually does not result in optimal reproduction of
the images. Efforts are under way in industry to automate the task of optimal image
reproduction with the use of a so-called Automated Pictorial Image Processing System.
However, the gray scale is used as a trail marker for the protection of NDLP’s investment in
its digital scans; having a calibrated gray scale associated with the image makes it possible to
go back to the original stage after transformations, and it also facilitates the creation of
derivatives. The gray scale could be part of the image, or the file header could contain the
digital values.
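
The linearity check described in this section can be sketched in a few lines. This is only an illustration, assuming the densitometer readings and the mean digital value of each gray-scale step are already available; the function name `tone_linearity`, the sample readings, and the tolerance of 5 digital counts are hypothetical and are not taken from this report (which leaves the acceptable deviation to be specified).

```python
import numpy as np

def tone_linearity(densities, digital_values):
    """Fit a straight line to digital value versus density.

    Returns (slope, intercept, max_deviation), where max_deviation is the
    largest absolute difference, in digital counts, between a measured
    gray-scale step and the fitted line.
    """
    d = np.asarray(densities, dtype=float)
    v = np.asarray(digital_values, dtype=float)
    slope, intercept = np.polyfit(d, v, 1)  # least-squares line
    max_deviation = float(np.max(np.abs(v - (slope * d + intercept))))
    return float(slope), float(intercept), max_deviation

# Hypothetical readings: density of each step and its mean pixel value.
step_densities = [0.1, 0.5, 1.0, 1.5, 2.0]
step_values = [245, 205, 155, 105, 55]

slope, intercept, max_dev = tone_linearity(step_densities, step_values)
TOLERANCE = 5.0  # assumed acceptance limit in digital counts
print(f"slope={slope:.1f} intercept={intercept:.1f} max deviation={max_dev:.2f}")
print("pass" if max_dev <= TOLERANCE else "fail")
```

A pass/fail limit such as `TOLERANCE` would have to come from the specification that the report says still needs to be worked out.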
[Three histogram panels, (A), (B), and (C): number of pixels (0 to max.) plotted against gray level (0 to 255).]


Fig. 3 Histograms of the image files can be used to check whether all digital levels from 0 to 255 are used 
(A), whether any clipping (loss of shadow and/or highlight details) occurred during scanning (B), or 
whether the digital values are unevenly distributed as can be the case after image manipulation (C). 
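
The three histogram checks named in the caption of Fig. 3 could be automated roughly as follows. This is a minimal sketch for 8-bit grayscale data; the function name `histogram_report` and the 1 percent clipping threshold are assumptions made for illustration, not values from this report.

```python
import numpy as np

def histogram_report(pixels, clip_fraction=0.01):
    """Summarize the histogram of an 8-bit grayscale image.

    clip_fraction is an assumed threshold: if more than this share of the
    pixels sits at level 0 or level 255, that end is flagged as clipped.
    """
    pixels = np.asarray(pixels, dtype=np.uint8)
    counts = np.bincount(pixels.ravel(), minlength=256)
    used = np.flatnonzero(counts)  # gray levels that occur at all
    lowest, highest = int(used[0]), int(used[-1])
    # Unused levels between the lowest and highest used level, as can
    # appear after image manipulation (case C).
    gaps = int(np.count_nonzero(counts[lowest:highest + 1] == 0))
    total = pixels.size
    return {
        "lowest_level": lowest,
        "highest_level": highest,
        "shadow_clipping": bool(counts[0] / total > clip_fraction),
        "highlight_clipping": bool(counts[255] / total > clip_fraction),
        "empty_levels_in_range": gaps,
    }

# Hypothetical example: a ramp whose contrast was stretched after scanning,
# which leaves every second gray level unused.
ramp = np.repeat(np.arange(0, 128, dtype=np.uint8) * 2, 8)
print(histogram_report(ramp))
```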


Detail and Edge Reproduction (MTF) 


To test the detail and edge reproduction of an imaging system, an edge or a sine-wave 
target should be scanned. In addition, the sine-wave target will make it possible to check for 
resampling problems. 


In the case of document or microfilm scanning, a test target for legibility should also be 
included. 


Fig. 4 Upper part of AIIM scanner test chart #2 containing different typefaces for
“legibility tests.”

Fig. 5 Sine Patterns sine-wave target. The sine waves in the two center rows of the
image are used to calculate the MTF. The MTF shows how much of the modulation that was
in the original image made it into the pixel values of the scanned image of the target.

Fig. 6 Resolution test chart developed for electronic still photography. The black bars
are used to calculate the modulation transfer function.

Fig. 7 Bar targets with converging lines, like this star pattern, can be used to visually
check the so-called cut-off frequency of the system (i.e., the smallest features that can be
resolved), but they cannot be used to get information on how the system is working for all
the different frequencies in the image.


Fig. 8 Graph of the modulation transfer function. The MTF shows the performance of the
scanning system over the whole range of frequencies (in the target represented by sine
waves that are closer and closer together). The cut-off point of the system represents the
highest frequency (the finest image details) that the scanner is able to resolve. An MTF
specification that must be met by vendors will have to be worked out.
[Graph: MTF plotted against frequency, with the cut-off frequency marked.]
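
As an illustration of how an MTF value can be derived from a sine-wave target, the sketch below computes the modulation of each scanned sine patch and divides it by the known modulation of the target at that frequency. It assumes that the pixel values have already been linearized with respect to the target's reflectance and that the target's own modulation is known from its calibration data; the function names, frequencies, and sample values are hypothetical.

```python
import numpy as np

def modulation(values):
    """Modulation of one sine patch: (max - min) / (max + min)."""
    v = np.asarray(values, dtype=float)
    return (v.max() - v.min()) / (v.max() + v.min())

def mtf_from_sine_patches(patches, target_modulation):
    """Return {frequency: MTF}, with MTF = image modulation / target modulation.

    patches maps a spatial frequency (e.g., cycles/mm) to the pixel values
    read along a scan line across that patch; target_modulation maps the
    same frequencies to the calibrated modulation of the target itself.
    """
    return {f: modulation(p) / target_modulation[f]
            for f, p in sorted(patches.items())}

# Hypothetical scan lines: the finer pattern comes out with less contrast.
x = np.arange(400) / 400.0
patches = {
    1.0: 128 + 100 * np.sin(2 * np.pi * 4 * x),
    5.0: 128 + 50 * np.sin(2 * np.pi * 20 * x),
}
target = {1.0: 0.8, 5.0: 0.8}

for freq, value in mtf_from_sine_patches(patches, target).items():
    print(f"{freq:4.1f} cycles/mm  MTF = {value:.2f}")
```

In practice the minimum and maximum would be averaged over several cycles to reduce the influence of noise; the simple max/min form is used here only to keep the sketch short.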


Noise


The result of the noise test is twofold. First, it shows the noise level of the system,
indicating how many bit levels of the image data are actually useful. For image quality
considerations, the signal-to-noise ratio (S/N) is the important factor to know.
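
A noise measurement on one uniform gray patch can be sketched as follows. This is a simplified illustration: it takes RMS noise to be the standard deviation of the pixel values in the patch and S/N to be the mean divided by that value. The function name `patch_noise` and the simulated patch are assumptions, not part of the report's method.

```python
import numpy as np

def patch_noise(patch):
    """Return (mean, rms_noise, snr) for a scanned patch of uniform gray.

    rms_noise is the standard deviation of the pixel values around their
    mean; snr is mean / rms_noise (infinite for a perfectly flat patch).
    """
    p = np.asarray(patch, dtype=float)
    mean = float(p.mean())
    rms = float(p.std())
    snr = mean / rms if rms > 0 else float("inf")
    return mean, rms, snr

# Hypothetical patch: mid-gray with simulated random noise.
rng = np.random.default_rng(0)
patch = 120 + rng.normal(0.0, 2.0, size=(64, 64))

mean, rms, snr = patch_noise(patch)
print(f"mean={mean:.1f} rms noise={rms:.2f} S/N={snr:.1f}")
```

Repeating the measurement on each step of the gray scale gives RMS noise as a function of density, which is the kind of curve a noise graph would show.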


Fig. 9 Noise refers to random variations [illegible] ...sented as a random pattern in the
uniform gray patterns. This graph shows the RMS (Root Mean Square), a statistical number
representing noise over the density levels of the original image.
[Graph: RMS noise plotted against density.]


Recommended Equipment


To judge images on a monitor, a good monitor/video card combination should be available.
The PC should have enough RAM to be able to work efficiently with the highest resolution
images. IPI worked with the following equipment:

Nanao FX2-21 Monitor
#9 Imagine Pro 8MB Graphic Card
Monitor calibration tool: Colorific from Sonnetech, Ltd. (included with the monitor
suggested above)
Intel Pentium 200MHz motherboard
64MB of RAM
Adaptec wide SCSI controller
Seagate 4.2GB wide SCSI hard drive
Plextor 6x CD ROM drive


Next Steps 


The quality review of scanned images incorporates much more than a specification
issue, e.g., how many ppi are required for scanning the documents. Most of the major
scanning projects are now going through this phase of setting up a complete framework.


After the images are scanned, quality control must be applied to ensure that all the 
images have been scanned correctly (no missing images, numbering correct, no 
“half images,” etc.). Part of the control is done by the vendor, part is done by LC 
staff. The Library would like to have a framework to be able to contract out the 
quality control of the scanned images. To assist in creating future RFPs for the 
vendors, a follow-up project should be designed to define image fundamentals and 
to investigate the availability of existing software to do the checking. 

Next steps for the NDLP could include a feasibility study using unified targets and 
trail markers in the three work flows. This would lead to a more exact definition of 
the software requirements needed to calculate the image parameters. 

Another step should be to compile a glossary and provide training tools for in-house
and vendor scanning technicians.


Introduction: National Digital Library Project 


To support its growing role in on-line access, the Library has established the National 
Digital Library Program (NDLP), which has as its primary focus the conversion of histori- 
cal collections to digital form. During the next five years, the Library plans to convert as 
many as five million of its more than one hundred million items. The material to be con- 
verted includes books and pamphlets, manuscripts, prints and photographs, motion pic- 
tures, and sound recordings. Some are in their original forms while others have been refor- 
matted as microfilm or microfiche. As America’s national library, the Library of Congress is 
committed to establishing and maintaining standards and practices that will support the
development of the National Digital Library.


History of the Consulting Project 


Methodology for Evaluation of Performance and Products of Scanning Service Providers 


IPI was contracted to consult on methods to evaluate the performance and products of 
scanning service providers hired to convert Library materials from paper, flat film, and 
microfilm to digital images. The Library is first of all looking for guidance for the measure- 
ment of spatial resolution. The report shall recommend quality assurance procedures to be 
used to create measurable images, together with a description of the tools and/or devices 
needed to measure the outcome of the resulting images. Specifically, the NDLP is seeking 
technical guidance regarding such concerns as how to perform tests after scanning to measure
capture resolution (and tonality) by interpreting or reading test targets or by other
means which may be suggested.


Article to Be Delivered 


Report of Discussions, Investigations, and Recommendations for Procedures to Be Used, Supported 
by Scientific Conclusions. 


Recommendations for targets and other approaches to measure results for each of the 
groups listed on the table below, together with recommendations for appropriate devices for 
the actual measurement itself. 

An outcome of the discussions and evaluation may be that new custom targets may 
need to be devised for the NDLP for effective scanning from the various media. 

The Library provided a list of different work flows (see table) for which they needed 
guidance. 
Categories of images: thumbnail (uncompressed TIFF file); 1K x 1K, JPEG highly compressed (20:1/30:1)
for viewing on the screen; 3K x 3K, JPEG slightly compressed (4:1/5:1) for reproduction purposes

Columns: Source | Device | Image type | Resolution [rest of heading illegible]

Source: Document, paper original. Archetype: 8 1/2 x 11-inch document
Flatbed scanner | Bitonal | 300 dpi [other entries illegible]
Flatbed scanner | Grayscale or color | 300 dpi [other entries illegible]
Camera or other overhead device | Bitonal | 300 dpi [other entries illegible]
Camera or other overhead device | Grayscale or color | 300 dpi [other entries illegible]

Source: Pictorial item, reflected light. Archetypes: 8 x 10-inch print; item on 11 x 14-inch board with significant writing to the edge
Flatbed scanner | Grayscale or color | Image in 1K x 1K-pxl window; image in 3K x 3K-pxl window
[device illegible] | Grayscale or color | Image in 1K x 1K-pxl window; image in 3K x 3K-pxl window
Via intermediate (typically film) | Grayscale or color | Image in 1K x 1K-pxl window; image in 3K x 3K-pxl window

Source: Pictorial item, transmitted light. Archetypes: 8 x 10-inch negative; 4 x 5-inch transparency; 35mm slide
Flatbed scanner | Grayscale or color | Image in 1K x 1K-pxl window; image in 3K x 3K-pxl window
[device illegible] | Grayscale or color | Image in 1K x 1K-pxl window; image in 3K x 3K-pxl window
Via intermediate (typically film) | Grayscale or color | Image in 1K x 1K-pxl window; image in 3K x 3K-pxl window

Source: Microfilm. Archetype: 35mm film containing images of documents on letter- or legal-size paper
Microfilm [illegible] device | Bitonal; grayscale or color | Produce image to print to typing paper


Meeting in Washington (December 15th and 16th)

The Library was seeking answers to questions related to spatial resolution in scanning
documents, books, photographs, and microfilm. However, during our stay more and more
questions related to other areas of image quality came up. The following report will concentrate
on the issue of resolution, but IPI also proposes a framework to deal with other aspects
of image quality. Especially in the case of photographs, good tone reproduction is crucial for
a good digital reproduction of the original images.


Notes from Meeting 

Scanning photographs calls for the most rigid approach to image quality. It requires a
quality framework that addresses all the different points that came up during the discussions.
Resolution, tone reproduction, and monitor calibration are the most critical ones up
to now. Past scanning projects have shown that interaction with vendors can be problematic
because a clear common language is lacking.

Legibility is the main problem in scanning microphotographic materials and paper
documents. In both cases, illustrations may pose problems. Paper documents often contain
halftone images that pose a problem for scanning. For microfilms, the quality of halftone
illustrations is dependent on good microfilming techniques and on the scanning process.


Issues in Scanning of Books 


The following points were discussed: 

• Legibility is the main criterion for judging the quality of the digitized documents.

• Guidelines for scanning technicians in existing contracts have been worked out to
assure a certain level of quality, but new RFPs shall have better-defined requirements.

• Practical or “doable” solutions are desired; nothing too complicated.

• The Library wishes to create a vendor-selection process based on objective tests.
How can spatial resolution be measured? How can you be sure that you get the
resolution you ask for?

• Up to this point spatial resolution has not been a problem in the task of scanning
documents. However, in the process of scanning books, a lot of points do pose
problems, like the handling of the books, finding the right threshold for the digital
values that differentiate the background of the page from the text, etc.

• The quality inspection for vendor selection shall be a pass/fail process.

• Requirement: the system shall deliver an image with specified ppi. In the case of
fonts, the requirement is whether a font is resolvable or not.

• In scanning Civil War photos (11 x 7 inches mounted on boards approximately 11
x 14 inches in size), there is some fine text on the mounts that should be readable.
For pictorial images, the screen appearance is the crucial point, meaning that the
usual output medium is the CRT screen.

• For paper documents, a laser printer is the usual output medium by which the
users will judge the quality.

• Printed halftone reproductions are a problem for the scanning process. In order to
deal with the problem of moiré, the images must be dithered. So far, a special
algorithm from Xerox has proved to be best for that application, but the algorithm
is only available as part of special scanning hardware.

• QC procedures that are already in place for illustration images:

- File names are reviewed.

- Image defects (e.g., obvious defects such as half images, dirt, dust, black or
white borders, cropping) are looked for.

- For printed halftones, the lightness of the image is subjectively judged.

• Deciding whether or not an image needs to be dithered is sometimes left up to the
discretion of the scanning operator. Experience of the operator is very important in
determining the quality of the scans.

• Software tools in use: PhotoStyler, View Director, Docu Viewer, HiJaak Pro Image
Viewer


Microfilm Scanning Issues 


The following is a summary of the points discussed. 


Very few companies offer microfilm scanning services for special documents. The 
scanning of these materials requires special handling, higher standards, and greater 
quality control than the scanning of record materials for government and business. 
Multiple exposures, targets, and frequent changes in reduction ratio, orientation, 
and document size are common to preservation microfilm. In addition, the original 
materials often contain illustrations that need to be captured in halftones. Besides
the possibility of dealing with scanning halftones, legibility is the main quality issue
when scanning preservation microfilms.


Photo Scanning Issues 


The following is a summary of the points discussed. 
• According to a strict guideline that must be observed in handling some of the
photographs, originals must not be flipped over (physically). This imposes some
restrictions on the scanning devices that can be used. Therefore, the use of a digital
camera (like the Kontron ProgRes) in a reproduction setup might be the appropriate
way to go.

• The digital images will be used as references, not as preservation surrogates, meaning
that a medium level of quality is required. The amount of storage needed for
the images is a major concern.

• Text on posters and images should be readable.

• Monitor calibration is a major concern when looking at images.

• Daguerreotypes: security issue, high-quality digital image wanted. The daguerreotype
collection project showed a lot of problems in the scanning process and the
digital images obtained were not satisfactory.

Another issue came up while browsing through the collection: What should be done
with stereo images?

It might be an interesting project to present part of the stereo image collection to the
public in the form of anaglyphs: two superimposed images of different colors, such as red and
green, representing slightly different angular views of the same subject. When viewed
through filters of the same colors, each eye sees the corresponding left or right image, but
the viewer perceives a single fused three-dimensional image. There exists a variety of programs
used in microscopy that could be used (with certain changes) to prepare the stereo
images. Special red-green glasses would be needed for viewing the images.


History of Scanning Projects in the Photographic Department 


Many different approaches have been used for scanning. What is needed is a “bigger 
picture” that creates a standardized and documented set of digital images. What approaches 
should be taken for calibrating monitors, “standardizing” viewing images, etc.? 

Once this framework is set up, one of the next steps might be to develop standards for 
the training of scanning technicians. 

How to articulate questions for the vendors is another issue that would become clearer
if a proper framework including all different scanning parameters were in place.


Introduction to Image Quality in Digital Imaging


As digital imaging emerges from its infancy, more and more libraries and archives are
starting digital projects. Unfortunately, questions of image quality are often neglected at the
beginning of such projects. Despite all the possibilities for endless copying, distributing, and
manipulating digital images, image quality choices made when the files are first created have
the same “finality” that they have in conventional photography. These choices will have a
profound effect on project cost, value of the final project to researchers, and usefulness of
the images as preservation surrogates. Image quality requirements therefore should be
established before a digitization project starts.

The creation of large digital image collections is not likely to be attempted more than 
once a generation. This means that it had better be done right the first time, so being aware 
of the technical nature of the digital images produced is quite important. Building high- 
quality digital image collections requires a long-term strategy for developing digital assets of 
lasting value. 

Image quality has to be determined separately for digital images that result from refor- 
matting photographic, microphotographic and paper documents. For the two latter catego- 
ries, image quality will mean, above all, legibility of all the significant data. For photo- 
graphs, image quality is determined by tone and color reproduction, detail and edge repro- 
duction, and noise. 

The main quality issue for reformatted paper documents is legibility. This is a twofold
problem: a high enough spatial resolution must be chosen to capture the details of the
characters, and the correct threshold must be chosen to be able to distinguish between
foreground and background.
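
To make the threshold half of the problem concrete, the sketch below picks a global threshold for an 8-bit grayscale page. The report does not name a thresholding method; Otsu's method is swapped in here purely as a familiar illustration, and the function name and sample page are hypothetical.

```python
import numpy as np

def otsu_threshold(pixels):
    """Return the gray level t that best splits an 8-bit image into two
    classes (values <= t and values > t) by maximizing the between-class
    variance (Otsu's method)."""
    counts = np.bincount(np.asarray(pixels, dtype=np.uint8).ravel(),
                         minlength=256)
    prob = counts / counts.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)           # weight of the dark class for each t
    m0 = np.cumsum(prob * levels)  # cumulative first moment up to t
    total_mean = m0[-1]
    w1 = 1.0 - w0                  # weight of the light class
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - m0) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))

# Hypothetical page: dark text (about level 40) on light paper (about 220).
page = np.full((100, 100), 220, dtype=np.uint8)
page[40:60, 10:90] = 40

t = otsu_threshold(page)
bitonal = page > t  # True = background (paper), False = foreground (text)
print("threshold:", t, "text pixels:", int((~bitonal).sum()))
```

A single global threshold of this kind is only a starting point; pages with stains or uneven illumination may need a different approach, which is part of why the report treats threshold selection as a quality issue in its own right.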

The Library’s functional image requirements for converting microfilms to digital files 
are as follows: 

It is required the digital images contain all of the significant data in the microfilm 
image. Success in retaining significant data will be determined by the legibility of 
the materials to be digitized under performance of this contract; i.e., when all the
words, drawings or other markings, or musical notes can be read in the digital 
image as could be read in the document on the microfilm. 

The high level of image quality in original photographs sets a very high standard for
successful reformatting projects. One reason why there is such high image quality inherent
in original collection materials is that large formats were quite common, even for amateur
photographers, until the 1960s. Many photographs represent work of outstanding
quality and information content.

In the context of collections in libraries and archives, the success or failure of a digital 
image database system ultimately depends on whether or not it really enhances access or 
promotes the preservation of the collection. This does not necessarily mean that images 
must be of the highest possible quality to meet the needs of the institutions. The way in 
which the digital files are to be used should dictate how high image quality should be. It 
should be remembered that the higher the quality, the more expertise, time, and cost is likely 
to be needed to generate and deliver the digital image. 

The following are three examples of digital image quality criteria for photographic 
images: 

• The digital image is used only as a visual reference in an electronic data base. The
required digital image quality is low, both in terms of spatial and brightness resolution
content. The display is usually limited to a screen or low-resolution print
device. Exact color reproduction is not critical. Additionally, images can be compressed
to save storage space and delivery time. Using duplicates of the originals
and a low-resolution digitizing device will be sufficient for these applications.

• The digital image is used for reproduction. The quality requirements will depend on
the definition of the desired reproduction. Limiting output to certain spatial dimensions
will facilitate the decision-making process. The same applies to tonal reproduction.
Currently most digitizing systems will only allow an 8-bit-per-color output
which, if not mapped correctly, does not always allow for precise tonal and
color reproduction of the original.

• The digital image represents a “replacement” of the original in both spatial and tonal
information content. This goal is the most challenging to achieve given today’s
digitizing technologies and cost. The information content in terms of pixel equivalency
varies from original to original. It is defined not only by film format, but also
by emulsion type, shooting conditions, and processing techniques. Additionally,
8-bit-per-color scanning device output might be sufficient for visual representation on
today’s output devices, but it might not capture all the tonal subtleties of the original.
On the other hand, saving “raw” scanner data of 12 or 16 bits per color with
no tonal mapping can create problems for future output if the scanner characteristics
are not well known and profiled. Ultimately, “information content” has to be
defined, whether based on human visual resolving power, the physical properties of
the original, or a combination of both.

Each of the above quality criteria requires a different approach to the digitization 
process and a different level of resources. The challenge is that not just one but all criteria 
might be required to digitize one collection of photographs. 

Spatial resolution of a digital image, i.e., how many details an image contains, is usually 
given by the number of pixels per inch (ppi). Spatial resolution of output devices, such as 
monitors or printers, is usually given in dots per inch (dpi). 

To find the equivalent number of pixels that describe the information content of a 
specific photographic emulsion is not a straightforward process. Format of the original, film 
grain, film resolution, resolution of the camera lens, f-stop, lighting conditions, focus, blur, 
and processing have to be taken into consideration to accurately determine the actual infor- 
mation content of a specific picture. 

Given the spatial resolution of the files, how big an output is possible from the avail- 
able file size? The relationship between the file size of a digital image, its total number of 
pixels, and consequently its maximum output size at different spatial resolutions can be 
analyzed mathematically. The distinction has to be made between continuous-tone and
halftone output. Going into greater detail is beyond the scope of this report, but such 
information could be part of a glossary for scanning technicians. 


[Tables: Film Type versus Equivalent Number of Pixels, for very fine grain and for medium
grain films.]
Approximate pixel equivalencies for various film types and formats.

For optimal continuous-tone output the ratio between output dots and image pixels
should be 1:1. In case of printing processes that require halftone images, 1.5:1
oversampling (ppi of digital file is higher than lpi = lines per inch of output) is needed.
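
The relationship between pixel count, output resolution, and output size can be worked through with a small calculation. The sketch below uses the 1:1 and 1.5:1 ratios stated in the text; the function names and the sample figures (a 3K x 3K image, 300 dpi, 150 lpi) are illustrative assumptions.

```python
def max_output_inches(pixels, output_resolution, sampling_ratio=1.0):
    """Largest output dimension, in inches, for a given pixel count.

    output_resolution is dpi for continuous-tone output or lpi for
    halftone output; sampling_ratio is the number of image pixels needed
    per output dot or line (1.0 for continuous tone, 1.5 for halftone).
    """
    return pixels / (output_resolution * sampling_ratio)

def uncompressed_size_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Size of the raw pixel data, ignoring headers and compression."""
    return width_px * height_px * channels * bits_per_channel // 8

# One side of a 3K x 3K image, printed two different ways.
side = 3000
print(max_output_inches(side, 300))       # continuous tone at 300 dpi
print(max_output_inches(side, 150, 1.5))  # halftone at 150 lpi
print(uncompressed_size_bytes(3000, 3000))  # 8-bit RGB, in bytes
```

Under these assumptions the same file prints larger as a halftone at 150 lpi than as continuous tone at 300 dpi, because the halftone needs only 225 pixels per inch of output.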


Definition of Image Quality 


There are no guidelines or accepted standards for determining the level of image
quality required in the creation of databases for access or preservation of photographic,
microphotographic, or document collections. As a result, many institutions now starting
their scanning projects will be disappointed sooner or later, because their choices did not
take into account future changes in the technology. However, nobody knows what technology
will be available in a few years, and choosing the right scanning parameters is a task that
still needs to be researched. One problem is that the cycle of understanding image quality is
just beginning for the new imaging technologies.

Fig. 10 Comparison of original printed reproduction and digital image on monitor. A
standardized environment with calibrated monitors and dim room illumination is needed for
subjective image quality evaluation.


For documents the most important quality issue is legibility. A. Kenney states in her
report:*

Quality Index (QI) is a means for relating resolution and text legibility. While type
size might be the most important factor in legibility, other factors should be taken
into consideration, including typeface; the use of italics and boldface; pitch; line
width; interlinear spacing; counter; background color; printing surfaces; the degree
of fading, damage, and acid migration; and the quality of the original document.
Whether it is used for microfilming or digital imaging, QI is based on relating text
legibility to system resolution, i.e., the ability to capture fine detail. QI may be used
to forecast the levels of image quality, marginal (3.6), medium (5.0), or high (8.0),
that will be consistently achieved on the use copy. The applicability of standards
established for microfilming, an analog process, to image quality for material
converted via digital technology may be open to some debate. While acknowledging
differences between digital and analog capture, the C10 Committee developed a
Digital Quality Index formula that is derived from the Classic Quality Index
formula used in the micrographics industry. Both formulas are based on three
variables: the height of the smallest significant character in the source document, the
desired quality to be obtained in the reformatted version, and the resolution of the
recording device.

*A. Kenney and S. Chapman, Digital Resolution Requirements for Replacing Text-Based
Material: Methods for Benchmarking Image Quality.
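The quotation names the three variables but this report does not reproduce the formula itself. The sketch below assumes the form commonly cited from the Kenney and Chapman benchmarking work, QI = (dpi x 0.039 x h) / 3 for bitonal scanning and QI = (dpi x 0.039 x h) / 2 for gray-scale scanning, with h the character height in millimeters; it should be checked against that publication before being relied on. The function names are illustrative only.

```python
MM_TO_INCH = 0.039  # rounded conversion factor assumed in the benchmarking formula


def digital_qi(dpi, char_height_mm, bitonal=True):
    """Quality Index predicted for a given scanning resolution (assumed formula)."""
    divisor = 3.0 if bitonal else 2.0
    return dpi * MM_TO_INCH * char_height_mm / divisor


def required_dpi(target_qi, char_height_mm, bitonal=True):
    """Scanning resolution needed to reach a target QI (same formula, rearranged)."""
    divisor = 3.0 if bitonal else 2.0
    return target_qi * divisor / (MM_TO_INCH * char_height_mm)


# Hypothetical document whose smallest significant character is 1.0 mm high
for label, qi in (("marginal", 3.6), ("medium", 5.0), ("high", 8.0)):
    print(label, round(required_dpi(qi, 1.0)), "dpi bitonal")
```

Under these assumptions, high quality (QI 8.0) for a 1.0 mm character calls for roughly 615 dpi in bitonal mode.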

It should be kept in mind that scanning from an archive is different from scanning for 
prepress purposes. In the latter case, the variables of the process the scanned image is going 
through are known, and the scanning parameters have to be chosen accordingly. If an image 
is scanned for archival purposes, the future use of the image is not known, and neither are 
the possibilities of the technology that will be available a few years from now. This leads to 
the conclusion that decisions concerning the image quality of the archival image scans are 
very critical. 

Image quality is usually separated into two classes: 

• Objective image quality is evaluated through physical measurements of image 
properties. In the case of digital imaging this is achieved with special software 
evaluating the digital file. 

• Subjective image quality is evaluated through judgment by human observers. 

The historical emphasis on image quality has been on image physics (physical image 
parameters), also called objective image evaluation. Psychophysical scaling tools to measure 
subjective image quality have been available only for the last 25 to 35 years. 

What is a psychometric scaling test? Stimuli which do not have any measurable physical 
quantities can be evaluated by using psychometric scaling methods. The stimuli are rated 
according to the reaction they produce on human observers. Psychometric methods give 
indications about response differences. There are three established psychometric methods 
which provide a one-dimensional scale of response differences: the method of rank order, 
the method of paired comparison, and the method of categories. The method of rank order 
lets observers order the samples. In a paired comparison test, observers are asked to choose 
between two stimuli based on some criterion. The method of categories requires observers 
to sort stimuli into a limited number of categories; these usually have useful labels describ- 
ing the attribute under study (e.g., excellent, very good, good, fair, poor, unsatisfactory). 
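As a rough illustration of the method of paired comparison, the sketch below turns a list of observer choices into a one-dimensional preference scale. The sample names and judgments are invented, and the scale is a plain proportion of wins; formal psychometric scaling applies further statistical modeling that is not shown here.

```python
from collections import Counter


def preference_scale(judgments):
    """judgments: iterable of (preferred, rejected) pairs from observers.

    Returns, for each stimulus, the fraction of its presentations it won.
    """
    wins, shown = Counter(), Counter()
    for preferred, rejected in judgments:
        wins[preferred] += 1
        shown[preferred] += 1
        shown[rejected] += 1
    return {s: wins[s] / shown[s] for s in shown}


# Hypothetical judgments on three scans of the same original
judgments = [("A", "B"), ("A", "C"), ("B", "C"),
             ("A", "B"), ("B", "C"), ("C", "A")]
scale = preference_scale(judgments)
for name, score in sorted(scale.items(), key=lambda kv: -kv[1]):
    print(name, score)
```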


Technical pictorial image quality parameters can be assessed by considering the following
four basic attributes: 

Tone reproduction. Refers to the degree to which an image conveys the luminance 
ranges of an original scene (or of an image to be reproduced in the case of reformatting). It 
is the single most important aspect of image quality. Tone reproduction is the matching, 
modifying, or enhancing of output tones relative to the tones of the original document. 
Because all of the varied components of an imaging system contribute to tone reproduction, 
it is often difficult to control. 

Detail and edge reproduction. Detail is defined as relatively small-scale parts of a 
subject or the images of those parts in a photograph or other reproduction. In a portrait, 
detail may refer to individual hairs or pores in the skin. Edge reproduction refers to the 
ability of a process to reproduce sharp edges. 

Noise. Noise refers to random variations associated with detection and reproduction 
systems. In photography, granularity is the objective measure of density nonuniformity that 
corresponds to the subjective concept of graininess. In electronic imaging, it is the presence 
of unwanted energy in the signal. This energy will degrade the image. 

Color reproduction. Equivalent color reproduction is defined as reproduction in 
which the chromaticities and relative luminances are such as to produce the same appearance 
of color as in the original scene. 

Fig. 11 Color management ensures consistent colors through the whole digital chain, from
scanner to monitor to printer. 


Proposed Framework (Excluding Color and Subjective Quality Tests) 


The approach to image quality will be different for the three main image categories. 
Spatial resolution applies to all categories. In the case of microfilm scanning, calculating 
MTF does not make sense for bitonal mode. 

For paper documents and microfilms, where legibility is usually the main quality issue, 
special targets to calculate the QI are needed. 

For the first approach to pictorial image quality, this report will concentrate on three 
objective parameters: 

• Tone reproduction 

• Sharpness (modulation transfer function) 

• Noise 

We will try to set up a framework that will facilitate the objective measurement of these 
parameters (see below). Discussions with various people from the field have shown that there 
is high interest in having a software tool to easily measure these objective image parameters. 

This approach is the one taken by two projects looked at by IPI. Both projects scan 
targets—one, the Sine Wave Pattern test target and the other, the Resolution Test Chart for 
electronic still photography which is under development—and interpret the digital images 
with a software program that automatically calculates some parameters. 

The issue of color will be excluded from this framework. Nevertheless, IPI thinks it is 
very important for the Library to include some form of color management system for future 
handling of color originals. 

The following are some issues coming up in setting up the framework. Some apply to 
all three work flows, some apply more to pictorial images (see “Factors to Consider in 
Pictorial Digital Image Processing,” J. Holm). 

The flexibility of digital processes makes them ideal for producing aesthetically opti- 
mized images on a variety of output media, from monitors to snapshots to press sheets. 
However, this same flexibility makes the optimization difficult and complicated. In the case 
of digital photography for example, the quality of digital photographs is typically inferior to 
that of conventional photographs. This is due in part to the limitations of current image 
capture and output devices, but incorrect or incomplete processing is also a major factor. 

There are three reasons why a pictorial digital image processor may not be able to 
produce an optimal photographic rendition: lack of expertise, lack of information, and lack 
of time. Designers of conventional photographic systems are aware of this and design 
systems that optimize photographic renditions to a large extent automatically. 

Digital photography has the potential to optimally render every scene. A unified
approach to the problem is needed, together with an understanding of its complexity. Efforts 
are under way in industry to automate the task of optimal image reproduction with the use 
of a so-called Automated Pictorial Image Processing System (APIP). One of the major areas 
to be defined is the image-associated data required for APIP. These are the same data that 
are considered during the manual processing of images. 

Standardized approaches and data forms are required for interchangeability.
Manufacturers are strongly encouraged to facilitate the complete transfer of the information
associated with the image files. To quote M. Ester in “Specifics of Imaging Practice”: 


If I see shortcomings in what we are doing in documenting images, they are trace- 
able to the lack of standards in this area. We have responded to a practical need in 
our work, and have settled on the information we believe is important to record 
about production and the resulting image resource. These recording procedures have 
become stable over time, but the data would become even more valuable if there was 
broad community consensus on a preferred framework. Compatibility of image data 
from multiple sources and the potential to develop software around access to a 
common framework would be some of the advantages. 

The following outlines some of the points that have to be considered when going 
through the digital image chain from image capture (scanning) to output; the highest 
resolution files representing the best quality of the reformatting project can be considered 
the “digital masters.” Great care should be taken in producing them; they contain the value 
of the digital collection and require an intensive quality review. 


Digital Capture 


M. Ester proposes the following two approaches to digital capture for a consistent
conversion of a collection into the digital environment. Both approaches, however, use a
conventional duplicate as an intermediate product: 

Matching to film. Under this logic the goal is to make the digital image look the 
same as the photographic medium. In favorable circumstances, with color control 
bars in the photograph, one can take densitometer readings from the film, and 
knowing their digital counterparts, use these values as references for the scanned 
image. This process can be applied very consistently to achieve a record of the film. 

Matching to the scene. The logic of this process says that if we have a color bar in a 
photograph and the digital values of the target are known, then as the color bars in a 
scanned image are adjusted to their correct values the rest of the image is matched to 
these controls as well. We call this matching the “scene,” because it represents the 
photographic scene at the moment the photograph was taken. 


Processing for Storage 


The best way to store image data is to store the raw capture data. Subsequent processing 
of this data can only reduce the information contained, and there is always the possibility that 
better input processing algorithms will become available further on. Input processing removes 
the effects introduced by the scanning process. The image data archived should therefore be 
the raw data, along with the associated information required for processing, whenever pos- 
sible. This data may be compressed to some extent using lossless compression, but in many 
cases it may also be desirable to store image data which has undergone some processing to 
reduce file size. For example, the raw data may be linear and/or contain unoccupied levels, 
meaning that not every digital level actually contains a pixel of the image. Perception is quite 
nonlinear, and 12-bit linear data can be reduced to 8-bit nonlinear data with no perceptual loss 
if the reduction lookup table (LUT) adapts to the dynamic range of the data. 

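A reduction LUT of this kind can be sketched as follows. This is only an illustration under stated assumptions: the raw data are simulated random values, the power-law exponent of 1/2.2 stands in for whatever perceptual nonlinearity is actually chosen, and `build_reduction_lut` is a hypothetical helper name.

```python
import numpy as np


def build_reduction_lut(raw, in_bits=12, gamma=1 / 2.2):
    """Map linear raw levels to 8-bit nonlinear levels.

    The table is stretched over the levels actually occupied by the data,
    so no output codes are wasted on unoccupied input levels.
    """
    lo, hi = int(raw.min()), int(raw.max())
    levels = np.arange(2 ** in_bits, dtype=np.float64)
    scaled = np.clip((levels - lo) / max(hi - lo, 1), 0.0, 1.0)
    return np.round(255 * scaled ** gamma).astype(np.uint8)


rng = np.random.default_rng(0)
raw = rng.integers(200, 3000, size=(64, 64))   # simulated 12-bit linear data
lut = build_reduction_lut(raw)
img8 = lut[raw]                                # apply the table per pixel
print(img8.dtype, img8.min(), img8.max())
```

Storing the LUT (or its parameters) alongside the 8-bit file keeps the reduction documented, in line with the recommendation to archive the information required for processing.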
Processing for Viewing 


Processing for viewing is a type of output processing applied to produce images of good 
viewing quality. It is possible to design viewer software that can take image files which have 
undergone input processing and process them for output on a monitor. 


Brightness Resolution 


The most widely used values for bit-depth equivalency of digital images are still 8 bits per 
pixel for monochrome images and 24 bits for color images. These values are reasonably accu- 
rate for good-quality image output. Eight bits per channel on the input side is not sufficient 
for good-quality scanning of diverse originals. To accommodate all kinds of originals with 
different dynamic ranges, the initial quantization on the CCD side must be larger than 8 bits. 

Tone and color corrections on 8-bit images should be avoided. The existing levels are 
compressed even further, no matter what kind of operation is executed. To avoid the loss of 
additional brightness resolution, all necessary image processing should be done on a higher 
bit-depth file, and requantization to 8-bit images should occur after any tone and color 
corrections. 
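The loss of levels can be demonstrated with a small experiment. The sketch below applies two opposite tone curves (a darkening followed by the matching brightening) once directly to 8-bit data and once to 16-bit data that is requantized to 8 bits at the end. The gamma curves are arbitrary stand-ins for real tone corrections.

```python
import numpy as np


def apply_curve(values, max_value, gamma):
    """Apply a simple power-law tone curve and requantize to integers."""
    x = values.astype(np.float64) / max_value
    return np.round(max_value * x ** gamma).astype(np.int64)


ramp8 = np.arange(256)        # every possible 8-bit level
ramp16 = np.arange(65536)     # every possible 16-bit level

# Two successive corrections performed directly on 8-bit data
edited8 = apply_curve(apply_curve(ramp8, 255, 2.0), 255, 0.5)

# The same corrections on 16-bit data, requantized to 8 bits only at the end
edited16 = apply_curve(apply_curve(ramp16, 65535, 2.0), 65535, 0.5)
final8 = np.round(edited16 / 257.0).astype(np.int64)

print("levels left after editing in 8 bits: ", len(np.unique(edited8)))
print("levels left after editing in 16 bits:", len(np.unique(final8)))
```

The 8-bit path ends with fewer than 256 distinct levels because neighboring values merge at each rounding step, while the high-bit-depth path still fills the full 8-bit range.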

Correct brightness resolution is as much a part of overall digital image quality as is 
spatial resolution. Neither one can be neglected. 


What is Digital Resolution? 


Why do we measure resolution? We do so, first, to make sure that the information 
content of the original image is represented in the digital image and, second, to make sure 
that the scanning unit used to digitize the image is in focus. 

Unlike photographic resolution, digital resolution does not depend on visual detection 
or observation of an image. Digital resolution is calculated directly from the physical
center-to-center spacing between each sample or dot. This spacing is also called the
sampling interval. 

The number of sampling points within a given distance (usually an inch) is referred to 
as the device’s digital resolution, or pixels per inch (ppi). Digital resolution quantifies the 
number of sampling dots per unit distance while photographic resolution quantifies observed
feature pairs per unit distance, as in line pairs per millimeter (lp/mm). 

Fig. 12 The number of sampling points within a given distance is referred to as the device’s
digital resolution. In case (A) the digital resolution is low, and not all the image information
will be included in the digital file. Case (B) is sampled with a higher resolution. A
low-resolution derivative that is calculated from this high-resolution file (with a resampling
algorithm) will have a higher quality than the image originally scanned at low resolution.
This is because in case (B) all of the image information is included in the original file. 

Translating between units of classical resolution and digital resolution is simply a matter 
of “two.” Dividing digital resolution values in half will yield units that are equivalent to 
photographic resolution. But there are conceptual differences between the two that have to 
be kept in mind when using digital resolution. Aliasing is the primary problem to be taken 
into account here; furthermore, a misregistration between image details and image sensors 
may give the impression that a certain device has less resolution than it actually has. 
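The factor of two, together with the change from inches to millimeters, can be written out as a short conversion. This is a minimal sketch; the function names are illustrative, and the result is only the theoretical limit, before the aliasing and misregistration effects just mentioned.

```python
MM_PER_INCH = 25.4


def ppi_to_lp_per_mm(ppi):
    """Two samples are needed per line pair, so halve the sampling rate."""
    return ppi / 2.0 / MM_PER_INCH


def lp_per_mm_to_ppi(lp_per_mm):
    """Minimum sampling rate for a given photographic resolution."""
    return lp_per_mm * 2.0 * MM_PER_INCH


print(ppi_to_lp_per_mm(508))   # about 10 lp/mm
print(lp_per_mm_to_ppi(5))     # about 254 ppi
```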

If the sampling interval is fine enough to locate the peaks and valleys of any given sine 
wave, then that frequency component can be unambiguously reconstructed from its sampled 
values. Aliasing occurs when a wave form is insufficiently sampled. If the sampling is less 
frequent, then the samples will be seen as representing a lower-frequency sine wave. 
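The effect can be reproduced numerically. In the sketch below, a 5 cycles/mm sine wave is sampled once at 20 samples/mm and once at only 6 samples/mm, and the strongest frequency in each sampled signal is read from its Fourier spectrum. The frequencies and sampling rates are arbitrary example values.

```python
import numpy as np


def dominant_frequency(signal_freq, sampling_rate, length_mm=10.0):
    """Strongest frequency (cycles/mm) found in a sampled sine wave."""
    n = int(round(sampling_rate * length_mm))
    x = np.arange(n) / sampling_rate                 # sample positions in mm
    samples = np.sin(2 * np.pi * signal_freq * x)
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(n, d=1.0 / sampling_rate)
    return freqs[np.argmax(spectrum)]


print(dominant_frequency(5.0, 20.0))   # sufficiently sampled: 5 cycles/mm
print(dominant_frequency(5.0, 6.0))    # undersampled: appears as 1 cycle/mm
```

With 6 samples/mm the highest frequency that can be represented is 3 cycles/mm, so the 5 cycles/mm pattern shows up as a 1 cycle/mm pattern instead.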

The most noticeable artifact of aliasing is high spatial frequencies appearing as low spatial 
frequencies. After the wave form has been sampled, aliasing cannot be removed by filtering. 

Fig. 13 Aliasing occurs when a wave form is insufficiently sampled (dotted line). The
samples will show up as representing a lower-frequency sine wave (i.e., the waves are further
apart). Panels: original sampled image line; undersampled image line showing spatial aliasing. 

Fig. 14 Misregistration between the detectors and the lines of the target by half a pixel can
lead to the situation where the black-and-white lines of the original cannot be resolved and
will look like a gray field in the digital image (all the digital values are the same,
i.e., 128). 


In an ideal scan, the detectors and the lines of the target are perfectly aligned. The 
concept of misregistration can be easily shown by scanning a bar target. The detectors will 
only sense the light intensity of either the black line or the white space. If there is a misregis- 
tration between the centers of the lines and spaces relative to the detector centers, say by 
half a pixel, the outcome is different. Now each detector “sees” half a line and half a space. 
Since the output of every detector is just a single value, the intensities of the line and the 
space are averaged. The resulting image will therefore have the same digital value in every 
pixel. In other words, it will look like a grey field. The target would not be resolved. There- 
fore, the misregistration manifests itself as a contrast or signal loss in the digital image 
which affects resolution. Since it is impossible to predict whether a document’s features will 
align perfectly with the fixed positions of a scanner’s detectors, more than two samples per 
line pair are required for reliable information scanning. 
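The bar-target example can be simulated directly. The sketch below models a target whose black and white lines are each exactly one detector wide and lets every detector average the light falling on it; the detector count and the oversampling factor used to model the target are arbitrary choices.

```python
import numpy as np


def scan_bar_target(offset_px, n_pixels=8, oversample=100):
    """Detector outputs for a bar target shifted by offset_px pixels.

    Black lines have value 0, white spaces 255; each is one pixel wide.
    """
    fine = np.arange((n_pixels + 1) * oversample)
    target = np.where((fine // oversample) % 2 == 0, 0.0, 255.0)
    start = int(round(offset_px * oversample))
    return np.array([
        target[start + i * oversample: start + (i + 1) * oversample].mean()
        for i in range(n_pixels)
    ])


print(scan_bar_target(0.0))   # perfect alignment: 0, 255, 0, 255, ...
print(scan_bar_target(0.5))   # half-pixel shift: every detector reads 127.5
```

At a half-pixel shift every detector sees half a line and half a space, so the whole target collapses to a uniform mid-gray (127.5, i.e., 128 after rounding).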


How Do You Measure Digital Resolution? 


The fundamental method for measuring resolution is to capture an image of a suitable 
test chart with the scanner being tested. The test chart must include patterns with
sufficiently fine detail, such as edges, lines, square waves, or sine-wave patterns. 

Fig. 15 A wave is characterized by its amplitude and its spatial frequency (in cycles/mm). 


The MTF 


The best overall measure of detail and resolution is the MTF (modulation transfer 
function). MTF was developed to describe image quality in “classical” optical systems. The 
MTF is a graphical representation of image quality that eliminates the need for decision- 
making by the observer. The test objects are sine-wave patterns. The MTF is a graph that 
represents the image contrast relative to the object contrast on the vertical axis over the 
range of spatial frequencies on the horizontal axis, where high frequency in the test target 
corresponds to small detail in an object (see Fig. 20). MTF is complicated to measure in 
images on photographic materials, but it is relatively easy to measure in digital images. 


$$\text{Modulation Transfer Function} = \frac{\text{Output Modulation}}{\text{Input Modulation}} \quad \text{(across a range of frequencies)}$$


If MTF is measured for a sampled-data system, the measured MTF will depend on the 
alignment of the target and the sampling sites. An average MTF can be defined, assuming 
that the scene being imaged is randomly positioned with respect to the sampling sites. 
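For one sine-wave patch, the ratio of output to input modulation can be computed as sketched below. The definition of modulation as (max - min) / (max + min) is the usual one but is an assumption here, since the report does not spell it out; the patch values are simulated, not measured.

```python
import numpy as np


def modulation(values):
    """Modulation (contrast) of a sine-wave patch: (max - min) / (max + min)."""
    v_max, v_min = float(np.max(values)), float(np.min(values))
    return (v_max - v_min) / (v_max + v_min)


def mtf_at_frequency(target_patch, scanned_patch):
    """One point of the MTF: output modulation divided by input modulation."""
    return modulation(scanned_patch) / modulation(target_patch)


# Simulated patch at 5 cycles/mm: the scanner reduces the amplitude from 100 to 40
x = np.linspace(0.0, 1.0, 200, endpoint=False)           # positions in mm
target = 128 + 100 * np.sin(2 * np.pi * 5 * x)            # values known for the target
scanned = 128 + 40 * np.sin(2 * np.pi * 5 * x)            # values in the digital file
mtf = mtf_at_frequency(target, scanned)
print(round(mtf, 3))                                      # 0.4
```

Repeating this for every frequency patch on the target gives the full MTF curve.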

There are two approaches to defining the MTF of an imaging system. One is to use a 
sine-wave pattern, the other is to use a slanted edge. In the latter case, pixel values near 
slanted vertical and horizontal black-to-white edges are digitized and used to compute the 
MTF values. The use of a slanted edge allows the edge gradient to be measured at many 
phases relative to the image sensor elements, in order to eliminate the effects of aliasing. 
(This technique is mathematically equivalent to performing a “moving knife-edge
measurement.”) 

Fig. 16 The sine waves of the test target are scanned and translated into digital values. If
you were to measure how dark or light the image was at every point along a line across the
bars, the plot of these points would be a perfect sine wave. If the digital values along the
same line are plotted, the resulting figure is also a sine wave. 

Fig. 17 Input modulation/output modulation. 

Fig. 18 As the bars of the sine-wave target get closer together at higher frequencies, the
modulation (i.e., variation from black to white) that is recorded by the scanner gets smaller
and smaller. 

Fig. 19 Calculating the MTF using the moving knife-edge method. Pixel values across a
slanted edge are digitized and, through a mathematical transformation of these values into
the Fourier domain, the MTF of the system can be calculated. 

Fig. 20 Graph of the MTF. The MTF shows the performance of the scanning system over the
whole range of frequencies (in the target represented by sine waves that are closer and closer
together). The cut-off point of the system represents the highest frequency (the finest image
details) that the scanner is able to resolve. 
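The core of the edge-based calculation can be sketched in one dimension. The example assumes that an oversampled edge profile has already been extracted from the slanted edge (that projection step is omitted), and it uses a simulated Gaussian-blurred edge in place of real scanner data. The Hann window is a common practical choice, not something prescribed by the report.

```python
from math import erf, sqrt

import numpy as np


def mtf_from_edge(edge_profile, sample_spacing_mm):
    """MTF from an edge profile: differentiate, window, Fourier transform."""
    lsf = np.diff(edge_profile)                 # line spread function
    lsf = lsf * np.hanning(lsf.size)            # taper the ends of the profile
    spectrum = np.abs(np.fft.rfft(lsf))
    mtf = spectrum / spectrum[0]                # normalize to 1 at zero frequency
    freqs = np.fft.rfftfreq(lsf.size, d=sample_spacing_mm)
    return freqs, mtf


# Simulated edge: ideal black-to-white step blurred by a Gaussian (sigma 0.03 mm)
spacing = 0.01                                  # mm between profile samples
x = np.arange(-64, 64) * spacing
sigma = 0.03
edge = np.array([0.5 * (1 + erf(v / (sigma * sqrt(2)))) for v in x])

freqs, mtf = mtf_from_edge(edge, spacing)
for f, m in list(zip(freqs, mtf))[:12:2]:
    print(f"{f:5.2f} cycles/mm  MTF {m:4.2f}")
```

The printed values fall steadily from 1.0 as the frequency rises, which is the expected shape of an MTF curve for a blurred edge.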

In the case of bitonal scanning (e.g., microfilm) it does not make sense to calculate the 
MTF. In this case, the QI approach using test targets with fonts may be used. To qualify the 
system, running the scanner in gray scale mode might be considered. 

The QI approach may also be used to check whether Photo CD quality is good enough 
for 11 x 14-inch originals (Civil War photos, 11 x 7 inches mounted on boards
approximately 11 x 14 inches in size, that include fine text). 

Fig. 21 Scanning from original or intermediate? Scan on left is from an 8x10 
original negative, scan on right is from a 35mm intermediate. 

A recommended approach is to take a 35mm photograph from the original and scan it 
onto a Photo CD, then make a legibility test. 

In addition, if a reformatting step onto film is involved, film type differences might 
introduce problems in trying to keep the color consistent. 

An earlier project conducted in part at IPI (together with Stokes Imaging) showed that 
if it is possible to scan directly from the original the results are better. 


Targets to Use 


What kind of targets can be used to measure digital resolution? Usually the targets 
designed for measurement of photographic resolution are used. These “bar targets” all have 
the problems of aliasing and misregistration in the digital image. However, bar targets can 
be used to check resolution visually and obtain the “cut-off frequency,” i.e., the highest 
frequency the system is able to resolve. Visually checking bar targets is not an easy task; the 
observer must know what to look for. 

Therefore, to measure digital resolution of sampling devices another approach has to be 
taken using slanted edges or sine-wave patterns. 

The scanned test target should be evaluated with a software program (to be devel- 
oped). This idea has received very positive reactions from the various people contacted 
about it. Having an objective tool to compare different scanning devices will be more and 
more important. Up to now scanner manufacturers usually have used their own software 
when determining the spatial resolution or MTF of their systems. 

Fig. 22 IEEE Std 167A-1987 facsimile test chart designed for use with facsimile machines. It
is produced photographically and includes gray-scale bars, text, rules, and a continuous-tone
image. It also incorporates traditional line-pair patterns. 

Fig. 23 Upper part of AIIM scanner test chart #2 containing different typefaces for
“legibility tests.” 

Fig. 24 Scanner test target PM-189. It incorporates both the AIIM target and the IEEE
facsimile test chart. 

Fig. 25 Quality Index test image to calculate QI (Picture Elements, Incorporated). The
chart, labeled “Federal Reserve System” and “Quality Index Test Image,” lists for each point
size the Quality Index (high, medium, marginal), the height of the lowercase “e,” and a
sample: 6 point, 1.0 mm; 7 point, 1.2 mm; 8 point, 1.3 mm; 9 point, 1.5 mm; 10 point, 1.7 mm. 

Fig. 26 RIT Alphanumeric test target. It consists of lines of block letters. During inspection,
an observer must recognize letters, rather than detect resolved line pairs. Can be used for
“legibility tests.” 

Fig. 27 High-contrast bar targets with converging lines can be used to visually check the
so-called cut-off frequency of the system (i.e., the smallest features that can be resolved),
but they cannot be used to get information on how the system is working for all the different
frequencies in the image (Picture Elements, Incorporated). 

Fig. 28 Bar targets with converging lines, like this star pattern, can be used to visually
check the so-called cut-off frequency of the system, i.e., the smallest features that can be
resolved, but they cannot be used to get information on how the system is working for all the
different frequencies in the image. 

Fig. 29 USAF Resolving Power test target. The table below contains the number of lines per
millimeter related to the group number (number above a group of line pairs) and element
numbers (number on the side of the line pairs) on the target. In order to avoid sampling
effects, the target should be placed at an angle of 45 degrees to the CCD elements. This
target also allows only the determination of the cut-off frequency. 

Conversion table: lines/millimeter to lines/inch, and lines/millimeter in the USAF Resolving
Power Test Target 1951 by element and group number (surviving first row: 1.00, 2.00, 4.00,
8.00). 

Fig. 30 Resolution test chart for electronic still photography. The black bars are used to
calculate the modulation transfer function applying the moving knife-edge technique. 

Fig. 31 Sine Patterns sine-wave target. The sine waves in the two center rows of the image
are used to calculate the MTF. The MTF shows how much of the modulation that was in the
original image made it into the pixel values of the scanned image of the target. 

For future use (including color) Kodak’s Q-60 test chart should be included. 
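For the USAF Resolving Power target (Fig. 29), the values for each group and element can be recomputed rather than read from a chart. The sketch assumes the standard relation for the 1951 target, resolution = 2^(group + (element - 1) / 6) line pairs per millimeter, which is not stated in this report; the function name is illustrative.

```python
MM_PER_INCH = 25.4


def usaf_lp_per_mm(group, element):
    """Spatial frequency of a USAF 1951 element (assumed standard relation)."""
    return 2.0 ** (group + (element - 1) / 6.0)


print("element   " + "".join(f"group {g:<6d}" for g in range(4)))
for element in range(1, 7):
    row = "".join(f"{usaf_lp_per_mm(g, element):<12.2f}" for g in range(4))
    print(f"{element:<10d}{row}")

# The same element expressed per inch instead of per millimeter
print(usaf_lp_per_mm(2, 1) * MM_PER_INCH)
```

Each group doubles the frequency of the previous one, and the six elements within a group step through that doubling in equal ratios.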


Digital Resolution: Do You Get What You Ask For? 


How can you make sure that the digital files you get have actually been scanned at the
desired resolution and not sampled up or down from another resolution? 

Fig. 32 Q-60 test target as reference for color images. 

A helpful method for finding out whether any resampling was done to your images is
calculating the frequency spectrum of the image. Depending on the kind of resampling that
was done, the frequency spectra will look different. 
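A simple version of this check can be sketched on one image line of a sine-wave target. A line sampled directly at the final resolution is compared with a line sampled at a lower resolution and then enlarged by nearest-neighbor rescaling; the sampling rates (20 and 12 samples/mm), the threshold, and the choice of nearest-neighbor rescaling are assumptions made for the illustration.

```python
import numpy as np


def sine_line(samples_per_mm, length_mm=10, freq=5.0):
    """One image line of a sine-wave target (5 cycles/mm by default)."""
    x = np.arange(samples_per_mm * length_mm) / samples_per_mm
    return 128 + 100 * np.sin(2 * np.pi * freq * x)


def count_spectral_spikes(line, rel_threshold=0.05):
    """Number of Fourier components above a fraction of the strongest one."""
    magnitude = np.abs(np.fft.fft(line))
    return int(np.sum(magnitude > rel_threshold * magnitude.max()))


true_scan = sine_line(20)                         # scanned at the final resolution
low_res = sine_line(12)                           # scanned at a lower resolution ...
idx = (np.arange(true_scan.size) * low_res.size) // true_scan.size
rescaled = low_res[idx]                           # ... then enlarged to the same size

print(count_spectral_spikes(true_scan))           # 3: zero frequency and +/- 5 cycles/mm
print(count_spectral_spikes(rescaled))            # additional spikes from rescaling
```

The true scan shows only the three expected spikes, while the rescaled line shows extra components that betray the resampling.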


Fig. 33 Ideal frequency spectrum of a pure sine wave (5 cycles/mm). Three spikes show up in
the Fourier transform image (horizontal axis from -10 to 10 cycles/mm). 

Fig. 34 Sine-wave targets rescaled to 500 ppi by different rescaling techniques. The
uppermost image row shows true 500 ppi scanning (no rescaling). Depending on the rescaling
function, the sine-wave images show banding. This can be easily detected looking at the
enlargement of the Fourier transform image that shows five spikes after rescaling. Depending
on the rescaling (or resampling) technique used, the spectra will look different. 


Related Quality Questions 


Compression 


For many years, storage space has been one of the major concerns when dealing with 
digital images. Advances in image-data compression and storage-media development have 
helped to reduce this concern. Nevertheless, image compression in an archival environment 
has to be evaluated very carefully. The image deterioration caused by the compression 
scheme limits the future use of a digital image. Because that use is not yet clear, one copy of 
every image should be compressed using a lossless compression scheme—even if it means 
paying the price of a lower compression ratio. Advances in image processing and pattern 
recognition will lead to better compression schemes in the future. 


Lossless and Visually Lossless Compression 


All good compression schemes sacrifice the least information possible in achieving a 
reduced file size. Lossless compression makes it possible to exactly reproduce the original 
image file from a compressed file. Lossless compression differs from visually lossless com- 
pression, which is compression where the artifacts are not visible. Although the human 
visual system provides guidelines for designing visually lossless compression schemes,
ultimately the visibility of compression artifacts depends on the output. 


LZW (Lossless Compression) 


LZW (Lempel-Ziv-Welch) is a type of entropy-based encoding and belongs to a class of 
lossless compression that is performed on a digital image file to produce a smaller file which 
nevertheless contains all the information of the original file. For example, if an image
contains a large red area, it may require less space to describe the size, shape, and color of
that area than to individually specify the color of each pixel in that area. Entropy-based
schemes are particularly effective on images which contain large blocks of smooth tones.
Currently, the most common schemes are those based on Huffman encoding and the proprietary
LZW compression (used for TIFF files in Adobe Photoshop). 
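The dependence of lossless compression on image content can be tried out with any general-purpose lossless coder. The sketch below uses Python's `zlib` module, which implements DEFLATE (LZ77 plus Huffman coding) rather than LZW; it is swapped in only because it is readily available. The two test images, a smooth ramp and the same ramp with random noise added, are synthetic.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)

# Smooth tones: a horizontal ramp repeated on every row
smooth = np.tile(np.arange(256, dtype=np.uint8), (256, 1))

# The same ramp with a fine random noise pattern added
noise = rng.integers(-8, 9, size=smooth.shape)
noisy = np.clip(smooth.astype(int) + noise, 0, 255).astype(np.uint8)


def compression_ratio(image):
    """Original size divided by compressed size, after checking losslessness."""
    raw = image.tobytes()
    packed = zlib.compress(raw, 9)
    assert zlib.decompress(packed) == raw      # exact reconstruction
    return len(raw) / len(packed)


print("smooth image:", round(compression_ratio(smooth), 1))
print("noisy image: ", round(compression_ratio(noisy), 2))
```

The smooth image shrinks dramatically, while the noisy one barely compresses at all, even though both are restored bit for bit.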

The granular structure of film unfortunately hinders effective entropy-based encoding. 
The film grain imposes a fine random noise pattern on the image that does not compress 
well. There is currently no effective lossless way to deal with this problem. 


Photo CD (Visually Lossless) 


The Photo CD compression scheme utilizes both frequency and color compression in 
an attempt to produce visually lossless compression. A fundamental difference between 
Photo CD compression and JPEG compression is the fact that in the former the image is 
not broken up into blocks. Since the blocks can be a major source of visible artifacts with 
JPEG compression, this difference results in the Photo CD scheme having a decided advan- 
tage in terms of image quality. The disadvantage is a significant increase in computational 
requirements. It may take several minutes to uncompress a full-resolution Photo CD image 
on a “small” computer. The Photo CD system can implement whole-image transformation 
partly because there is an upper limit on the file size. Also, the image information in a Photo 
CD file is arranged in a pyramidal structure, with video and lower-resolution files being 
stored in uncompressed form. If a computer with limited computational capability is used 
to open a file, it is possible to access subsets of the data which are ordered according to 
spatial frequency. This structure allows a file to be opened at a lower resolution in a much 
shorter time because computations are performed only as required to produce the desired 
image resolution. 
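
The idea of a pyramidal file that can be opened at a reduced resolution can be sketched as follows. This is a simplified illustration only: it does not reproduce the actual Photo CD file layout or its encoding, the 2 x 2 averaging filter is an assumption, and the function names and the small test image are hypothetical. 

```python
def downsample(image):
    """Halve width and height by averaging each 2 x 2 group of pixels."""
    h, w = len(image), len(image[0])
    return [[(image[y][x] + image[y][x + 1] +
              image[y + 1][x] + image[y + 1][x + 1]) // 4
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]

def build_pyramid(image, levels):
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def open_at(pyramid, max_width):
    """Pick the largest stored level that is no wider than max_width."""
    for level in pyramid:
        if len(level[0]) <= max_width:
            return level
    return pyramid[-1]

# Hypothetical 64 x 48 grayscale ramp standing in for a scanned image.
full = [[(x * 4) % 256 for x in range(64)] for y in range(48)]
pyramid = build_pyramid(full, levels=4)
sizes = [(len(level[0]), len(level)) for level in pyramid]
preview = open_at(pyramid, max_width=20)
print(sizes, (len(preview[0]), len(preview)))
```

A viewer that needs only a small preview reads a small level and never touches the full-resolution data, which is why opening at a lower resolution takes much less time. 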

It has recently become possible to use Photo CD files on the World Wide Web, and they 
can be easily incorporated into HTML files. 


Lossy Compression 


JPEG stands for Joint Photographic Experts Group, which is the group responsible for 
the development of the compression approach named after it. JPEG is one type of lossy 
compression with a number of user-selectable options. It divides the image into a number of 
8 x 8-pixel blocks and then performs a Discrete Cosine Transform (DCT) on these blocks to 
transform the data into frequency space. The frequency space data is then requantized to a 
selectable number of bits, resulting in a significant reduction in the bit depth at high spatial 
frequencies. 
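
The block transform and requantization described above can be sketched for a single 8 x 8 block. This is a minimal illustration, not a JPEG codec: the quantizer step function, its parameters, and the sample block are hypothetical, and real JPEG uses standardized or user-supplied quantization tables followed by entropy coding. 

```python
import math

N = 8  # block size used by JPEG

def _scale(k):
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def _basis(pos, freq):
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))

def dct2(block):
    """2-D DCT of an N x N block; result[u][v] has vertical frequency u."""
    return [[_scale(u) * _scale(v) * sum(
                block[y][x] * _basis(y, u) * _basis(x, v)
                for y in range(N) for x in range(N))
             for v in range(N)]
            for u in range(N)]

def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v]
                 * _basis(y, u) * _basis(x, v)
                 for u in range(N) for v in range(N))
             for x in range(N)]
            for y in range(N)]

def step(u, v, base=4, slope=6):
    """Hypothetical quantizer step: coarser at higher spatial frequencies."""
    return base + slope * (u + v)

def quantize(coeffs):
    return [[round(coeffs[u][v] / step(u, v)) for v in range(N)]
            for u in range(N)]

def dequantize(levels):
    return [[levels[u][v] * step(u, v) for v in range(N)] for u in range(N)]

# Hypothetical block: a smooth diagonal gradient of 8-bit pixel values.
block = [[100 + 5 * x + 3 * y for x in range(N)] for y in range(N)]
shifted = [[p - 128 for p in row] for row in block]   # center around zero
levels = quantize(dct2(shifted))
restored = [[round(v) + 128 for v in row] for row in idct2(dequantize(levels))]

zeros = sum(v == 0 for row in levels for v in row)
max_error = max(abs(a - b) for ra, rb in zip(block, restored)
                for a, b in zip(ra, rb))
print(f"{zeros} of {N * N} coefficients are zero; max pixel error {max_error}")
```

Most quantized coefficients of a smooth block become zero, which is what makes the subsequent coding efficient; the price is a small reconstruction error, which grows as the quantizer steps are made coarser. 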

The advantages of JPEG compression are its user selectability to ensure visually lossless 
compression, high compression ratio, good computational efficiency, and good film grain 
suppression characteristics. Future developments proposed for the JPEG standard allow for 
tiling extensions, meaning that multiple-resolution versions of an image can be stored 
within the same file (similar to the concept behind the Photo CD files). 

Fig. 35 JPEG lightly compressed (4:1/5:1) can already show some compression artifacts. 
Therefore the highest quality master file should be archived using a lossless compression 
algorithm. 


The concern that repeated JPEG compression causes deterioration of image quality is 
valid. Consequently, all image processing should occur before the file is compressed, and the 
image should only be saved once using JPEG. Because the Library does not yet know the 
future use of the images (e.g., for reproduction) and the type of image processing that will 
be involved, one high-quality image file (a master) should be retained in a lossless 
compression scheme. 


Monitor Calibration 


A common problem when using different computer systems/monitors in an environment 
is the difference between the images when viewed on the various systems/monitors. In 
recent scanning projects this was a problem for the Library staff not only when working 
with images but also when discussing the quality of scans with vendors over the telephone, 
because the two parties did not see the same image. Therefore, it is a requirement to 
calibrate all the monitors. Nevertheless, one should be careful about basing discussions 
entirely on images on the monitor. 

The keys to calibrating a monitor are to set the gamma and white point. A monitor’s 
gamma is a measure of the response curve of each of the red, green, and blue channels, from 
black to full intensity. Typical gamma values for color monitors are in the range from 1.8 to 
2.2. The white point of a monitor is the color of white produced when all three color 
channels are at full intensity. It is specified as a color temperature, measured in Kelvin (with 
images getting bluer as their color temperatures rise). Various calibration tools exist, and 
they differ widely in complexity. Some application programs incorporate basic monitor 
calibration, and there are also dedicated calibration programs. Depending on the needs 
of the user, these can be very sophisticated and incorporate devices like photometers and 
colorimeters. 
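
The role of gamma can be illustrated with a short sketch. It assumes that a channel's response is a simple power law, luminance = (value / 255) ^ gamma, which real monitors follow only approximately; the function names and the 8-bit value range are hypothetical choices for this example. 

```python
def displayed_luminance(value, gamma, max_value=255):
    """Relative luminance (0..1) of a pixel value on a display with this gamma."""
    return (value / max_value) ** gamma

def correction_lut(source_gamma, display_gamma, max_value=255):
    """Lookup table that makes an image prepared for source_gamma
    appear with the same tones on a display with display_gamma."""
    exponent = source_gamma / display_gamma
    return [round(max_value * (v / max_value) ** exponent)
            for v in range(max_value + 1)]

mid_gray = 128
for gamma in (1.8, 2.2):
    # The same pixel value is rendered darker on the higher-gamma display.
    print(gamma, round(displayed_luminance(mid_gray, gamma), 3))

# Remap values so a gamma-1.8 image looks the same on a gamma-2.2 monitor.
lut = correction_lut(source_gamma=1.8, display_gamma=2.2)
corrected = lut[mid_gray]
```

Two uncalibrated monitors at opposite ends of the typical range therefore show visibly different midtones for the same file, which is one reason why two parties looking at the same scan may not see the same image. 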

The best way to view a monitor is under dim illumination that has a lower correlated 
color temperature than the monitor. This reduces veiling glare, increases the monitor dy- 
namic range, and results in the human visual system adapting to the monitor. This viewing 
condition results in the most aesthetically pleasing monitor images. The situation gets more 
problematic if originals and images on the screen are viewed side-by-side, because in this 
case the observers are not allowed to adapt to each “environment” individually. 

Once calibrated, the monitor should need recalibration only when conditions change, or 
on a monthly basis. It is a good idea to put a piece of tape over the monitor’s brightness and 
contrast controls after calibration and to maintain consistent lighting conditions. 


Suggested Hardware (Including Monitor Calibration Tool) 


To judge images on a monitor, a good monitor/video card combination should be 
available. The PC should have enough RAM to be able to work efficiently with the highest- 
resolution images. IPI worked with the following equipment: 

Nanao FX2-21 Monitor 

#9 Imagine Pro 8MB Graphic Card 

Intel Pentium 200MHz motherboard 

64MB of RAM 

Adaptec wide SCSI controller 

Seagate 4.2GB wide SCSI hard drive 

Plextor 6x CD ROM drive 

HP ScanJet 3c flatbed scanner with an optical resolution of 600 dpi 

Kontron Digital Camera ProgRes 3012, highest spatial resolution 3000 x 2300 pixels 
(scans provided by JJT Consulting, Inc., Thorndale, TX) 

For a calibration tool, IPI recommends Colorific from Sonnetech, Ltd. (included with 
the monitor suggested above). 

For documents, image quality should also be tested on printed output from different 
printers. 

The following software packages have been used for the interpretation and creation of 
the images: 

Adobe Photoshop 3.0 

Impact Professional 

IDL 4.0 


Closing Note 

The quality review of scanned images incorporates much more than a specification 
issue, e.g., how many ppi are required for scanning the documents. Most of the major 
scanning projects that scan for archiving are now going through this phase of setting up a 
complete framework. By further investigation of such a framework, the Library can take a 
leadership position in the field. 

Appendix: Scanning Target Sources 


AIIM Scanner Test Chart #2/IEEE Std 167A-1987 
AIIM, Association for Information and Image Management 
1100 Wayne Avenue, Suite 1100 
Silver Spring, MD 20910 
Fax 301-587-2711 


Quality Index Test Image/High Contrast Resolution Test Image 
Picture Elements, Inc. 
777 Panoramic Way 
Berkeley, CA 94704 


Resolution Test Chart for Electronic Still Photography 
ISO/TC42 
WG18/“Electronic Still Picture Imaging” 


RIT Alphanumeric Test Target 
Rochester Institute of Technology 
Research and Testing 
Technical and Education Center of the Graphic Arts 
66 Lomb Memorial Drive 
Rochester, NY 14623-5604 
Fax 716-475-6510 


Scanner Test Target PM-189 
A&P International 
2715 Upper Afton Road 
St. Paul, MN 55119-4760 
Phone 612-738-9329 
Fax 612-738-1496 


Sine Wave Pattern 
Sine Patterns 
236 Henderson Drive 
Penfield, NY 14526 
Phone 716-248-5338 
Fax 716-248-8323 